The Instance Easiness of Supervised Learning for Cluster Validity
Author
Abstract
“The statistical problem of testing cluster validity is essentially unsolved” [5]. We translate the problem of establishing the credibility of the output of unsupervised learning algorithms into the supervised learning setting. We introduce a notion of instance easiness for supervised learning and link the validity of a clustering to how easy an instance its output constitutes for supervised learning. Our notion of instance easiness for supervised learning extends the notion of stability to perturbations (used earlier for measuring clusterability in the unsupervised setting). We follow the axiomatic and generic formulations for cluster-quality measures. As a result, the trust we can place in a clustering result can be assessed with standard validation methods for supervised learning, such as cross-validation.
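The idea of judging a clustering by how easy its labels are to learn can be illustrated with a small sketch. This is not the paper's formal definition of instance easiness. It only shows the general pattern of treating cluster assignments as class labels and scoring them by cross-validation. The nearest-centroid classifier, the round-robin fold assignment, the toy data, and the names `nearest_centroid_cv`, `good_labels`, and `bad_labels` are all assumptions made for illustration.

```python
from collections import defaultdict


def nearest_centroid_cv(points, labels, n_folds=3):
    """Cross-validated accuracy of a nearest-centroid classifier.

    `labels` are cluster assignments treated as class labels (an
    illustrative stand-in for "easiness", not the paper's measure).
    Point i is held out in fold i % n_folds.
    """
    correct = 0
    for fold in range(n_folds):
        # Per-class running sums [sum_x, sum_y, count] over training points.
        sums = defaultdict(lambda: [0.0, 0.0, 0])
        for i, ((x, y), lab) in enumerate(zip(points, labels)):
            if i % n_folds != fold:
                s = sums[lab]
                s[0] += x
                s[1] += y
                s[2] += 1
        centroids = {lab: (sx / n, sy / n) for lab, (sx, sy, n) in sums.items()}

        # Classify each held-out point by its closest training centroid.
        for i, ((x, y), lab) in enumerate(zip(points, labels)):
            if i % n_folds == fold:
                pred = min(
                    centroids,
                    key=lambda c: (x - centroids[c][0]) ** 2
                    + (y - centroids[c][1]) ** 2,
                )
                correct += pred == lab
    return correct / len(points)


# Toy data (hypothetical): two well-separated groups in the plane.
group_a = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0.2, 0.8)]
group_b = [(x + 10, y + 10) for x, y in group_a]
points = group_a + group_b

good_labels = [0] * 6 + [1] * 6            # clustering that follows the groups
bad_labels = [i % 2 for i in range(12)]    # arbitrary alternating partition

acc_good = nearest_centroid_cv(points, good_labels)
acc_bad = nearest_centroid_cv(points, bad_labels)
print("accuracy with group-following labels:", acc_good)
print("accuracy with arbitrary labels:", acc_bad)
```

In this toy setting the labelling that follows the two groups is learned better under cross-validation than the arbitrary partition, which is the kind of signal the abstract proposes to use as evidence for or against a clustering.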
Similar resources
Wised Semi-Supervised Cluster Ensemble Selection: A New Framework for Selecting and Combing Multiple Partitions Based on Prior knowledge
The Wisdom of Crowds, an innovative theory described in social science, claims that the aggregate decisions made by a group will often be better than those of its individual members if the four fundamental criteria of this theory are satisfied. This theory has been used in clustering problems. Previous research showed that this theory can significantly increase the stability and performance of...
New Cluster Validation with Input-Output Causality for Context-Based Gk Fuzzy Clustering
In this paper, a cluster validity concept that moves from an unsupervised to a supervised manner is presented. Most cluster validity criteria were established in an unsupervised manner, although many clustering methods are performed in supervised and semi-supervised environments that use context information and performance results of the model. Context-based clustering methods can divide the input spaces u...
Contexts-Constrained Multiple Instance Learning for Histopathology Image Analysis
Histopathology image analysis plays a very important role in cancer diagnosis and therapeutic treatment. Existing supervised approaches for image segmentation require a large amount of high quality manual delineations (on pixels), which is often hard to obtain. In this paper, we propose a new algorithm along the line of weakly supervised learning; we introduce context constraints as a prior for...
Multiple Instance Learning with the Optimal Sub-Pattern Assignment Metric
Multiple instance data are sets or multi-sets of unordered elements. Using metrics or distances for sets, we propose an approach to several multiple instance learning tasks, such as clustering (unsupervised learning), classification (supervised learning), and novelty detection (semi-supervised learning). In particular, we introduce the Optimal Sub-Pattern Assignment metric to multiple instance ...